Intel gave the first architectural details of its Gaudi 3 third-generation AI processor at Vision 2024 this week in Phoenix, Arizona. Gaudi 3 is made up of two identical silicon dies joined by a high-bandwidth connection; each die has a central region of 48 megabytes of cache memory surrounded by four matrix-multiplication engines and 32 programmable units called tensor processor cores. Gaudi 3 delivers double the AI compute of Gaudi 2 when using 8-bit floating point and a fourfold boost for computations in the BFloat16 number format.
Wednesday, April 10, 2024
Meta plans to release an initial version of its next-generation Llama 3 large language model within the next month, with several models of varying capabilities to follow over the course of the year. Llama 3 will be able to answer a wider range of questions than its predecessor, including questions on more controversial topics. Meta has not released details about the model's size, but it is expected to have about 140 billion parameters; the biggest Llama 2 model has 70 billion.
Google's Gemini Code Assist is an enterprise-focused AI code completion and assistance tool. It was previously offered under the now-defunct Duet AI branding, which became generally available in late 2023, making Code Assist both a rebrand and a major update. It uses Gemini 1.5 Pro, which has a million-token context window, and will be available through plug-ins for popular editors such as VS Code.
In this article, OpenAI's Evan Morikawa provides insights into ChatGPT's inner workings, from input-text processing and tokenization to sampling predictions from large language models. ChatGPT operates by turning tokens into numerical vectors (embeddings), multiplying them through weight matrices containing billions of parameters, and selecting the most probable next word. The technology is grounded in extensive pretraining to predict text from vast amounts of internet data.
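The pipeline described above can be sketched in a few lines. This is a deliberately toy illustration, not OpenAI's implementation: the sizes, the random weights, and the single mean-pooling step standing in for the transformer layers are all invented for clarity, while a real LLM stacks many attention layers over billions of parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB_SIZE, EMBED_DIM = 16, 8  # toy sizes; real models use tens of thousands / thousands

embedding = rng.normal(size=(VOCAB_SIZE, EMBED_DIM))    # token ID -> vector
output_proj = rng.normal(size=(EMBED_DIM, VOCAB_SIZE))  # vector -> vocabulary logits

def next_token_probs(token_ids):
    """Return a probability distribution over the next token."""
    vectors = embedding[token_ids]        # look up each token's embedding
    hidden = vectors.mean(axis=0)         # stand-in for the transformer layers
    logits = hidden @ output_proj         # score every word in the vocabulary
    exp = np.exp(logits - logits.max())   # numerically stable softmax
    return exp / exp.sum()

probs = next_token_probs([3, 7, 1])       # some context of token IDs
predicted = int(np.argmax(probs))         # "selecting the most probable next word"
```

In practice the final step often samples from `probs` with a temperature rather than always taking the argmax, which is what gives the model its variety between runs.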
Adobe researchers have introduced VideoGigaGAN, an AI model that upsamples video by up to 8x with fine detail and temporal stability, surpassing traditional video super-resolution methods. Built upon GigaGAN, the model addresses past challenges by improving temporal consistency and detail retention in videos, setting a new standard in the field.
A future in which large language models enhance daily life, power business productivity and entertainment, and help people with everything they do requires highly efficient inference. Character.AI designs its model architecture, inference stack, and product from the ground up, creating unique opportunities to make inference more efficient, cost-effective, and scalable. The company serves more than 20,000 inference queries per second and can sustain this scale because of a number of key innovations across its serving stack. This blog post shares some of the techniques and optimizations Character.AI has developed.
Samsung's mobile chief, TM Roh, outlined the vision for Galaxy AI at a Paris event that showcased new AI features in Samsung products, including an evolution of Bixby driven by Samsung's LLMs. Roh discussed potential monetization strategies, including subscriptions for AI services, though plans are still under consideration.
Following a limited release to select Vertex AI users, Google's advanced text-to-image AI model, Imagen 3, is now available to all U.S. users through the ImageFX platform. This quiet expansion has sparked mixed reactions, with some praising the tool's improved texture and word recognition capabilities while others criticize its strict content filters.
AI technology has made significant strides, particularly with Google's recent update to NotebookLM, which allows users to create podcasts from their written content. This feature, known as Audio Overview, enables two AI hosts to engage in a lively discussion based on the user's material, summarizing key points and making connections in a conversational format. The tool is designed to help users make sense of complex information by grounding its responses in the uploaded content, complete with citations and relevant quotes.

The excitement surrounding this update stems from its impressive capabilities. Users have reported that the AI-generated podcasts are surprisingly good, capturing the essence of their essays and presenting them in an engaging manner. The technology combines natural voice synthesis, emotional expression, and a deep understanding of language, resulting in a product that feels both human-like and informative. The AI hosts can discuss intricate ideas and nuances, making the content accessible and enjoyable to listen to.

Despite the tool's effectiveness, there are questions about why Google has not heavily promoted it. Some speculate that the company may be cautious about potential misuse of voice technology, while others believe that Google is intentionally downplaying the product to avoid the pitfalls of overhyping. Instead, Google seems to be relying on its vast user base and the organic spread of information through social media to generate interest.

Feedback from users has been overwhelmingly positive, with many expressing surprise at the quality of the podcasts. While some minor inaccuracies have been noted, the overall impression is that the AI does an excellent job of summarizing and presenting the original material. The experience of hearing one's work transformed into a podcast can evoke strong emotions, akin to receiving recognition from traditional media.
In conclusion, Google's NotebookLM represents a significant advancement in AI technology, offering a unique tool for content creators. By transforming written work into engaging audio discussions, it opens up new possibilities for how information can be shared and consumed. As users continue to explore its capabilities, the implications for content creation and dissemination are likely to evolve, prompting further discussions about the role of AI in our lives.